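While communicating with a user, a task-oriented dialogue system must track the user's needs at each turn based on the dialogue history. This process, called dialogue state tracking (DST), is crucial because it directly informs the downstream dialogue policy. DST has attracted considerable interest in recent years, with the text-to-text paradigm emerging as the favored approach. In this review paper, we first present the task and its associated datasets. Then, considering the large number of recent publications, we identify the highlights and advances of research in 2021-2022. Although neural approaches have enabled significant progress, we argue that some critical aspects of dialogue systems, such as generalizability, remain underexplored. To motivate future studies, we propose several research avenues.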
Translated by Google Translate
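We aim to improve spoken language modeling (LM) using a very large amount of automatically transcribed speech. We leverage the collection of INA (the French National Audiovisual Institute) and obtain 19 GB of text after applying ASR to 350,000 hours of TV shows. From this text, spoken language models are trained either by fine-tuning an existing LM (FlauBERT) or by training an LM from scratch. The new models (FlauBERT-Oral) are shared with the community and evaluated on 3 downstream tasks: spoken language understanding, classification of TV shows, and syntactic parsing of speech. Results show that FlauBERT-Oral can be beneficial compared to the original FlauBERT version, demonstrating that, despite its inherently noisy nature, ASR-generated text can be used to build spoken language models.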
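MeSH (Medical Subject Headings) is a large thesaurus created by the National Library of Medicine and used for fine-grained indexing of publications in the biomedical domain. In the context of the COVID-19 pandemic, MeSH descriptors have emerged in relation to articles published on the corresponding topic. Zero-shot classification is an adequate response for timely labeling of the stream of papers with MeSH categories. In this work, we hypothesize that the rich semantic information available in MeSH has the potential to improve BioBERT representations and make them more suitable for zero-shot/few-shot tasks. We frame the problem as determining whether MeSH term definitions, concatenated with paper abstracts, are valid instances or not, and we leverage multi-task learning to induce the MeSH hierarchy in the representations by means of a seq2seq task. Results establish a baseline on the MedLine and LitCovid datasets, and probing shows that the resulting representations convey the hierarchical relations present in MeSH.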
This work focuses on unsupervised representation learning in person re-identification (ReID). Recent self-supervised contrastive learning methods learn invariance by maximizing the representation similarity between two augmented views of the same image. However, traditional data augmentation may introduce undesirable distortions of identity features, which is not always favorable in id-sensitive ReID tasks. In this paper, we propose to replace traditional data augmentation with a generative adversarial network (GAN) that is targeted to generate augmented views for contrastive learning. A 3D mesh guided person image generator is proposed to disentangle a person image into id-related and id-unrelated features. Deviating from previous GAN-based ReID methods that only work in id-unrelated space (pose and camera style), we conduct GAN-based augmentation on both id-unrelated and id-related features. We further propose specific contrastive losses to help our network learn invariance from id-unrelated and id-related augmentations. By jointly training the generative and the contrastive modules, our method achieves new state-of-the-art unsupervised person ReID performance on mainstream large-scale benchmarks.
Objective: Accurate visual classification of bladder tissue during Trans-Urethral Resection of Bladder Tumor (TURBT) procedures is essential to improve early cancer diagnosis and treatment. During TURBT interventions, White Light Imaging (WLI) and Narrow Band Imaging (NBI) techniques are used for lesion detection. Each imaging technique provides diverse visual information that allows clinicians to identify and classify cancerous lesions. Computer vision methods that use both imaging techniques could improve endoscopic diagnosis. We address the challenge of tissue classification when annotations are available only in one domain, in our case WLI, and the endoscopic images correspond to an unpaired dataset, i.e. there is no exact equivalent for every image in both NBI and WLI domains. Method: We propose a semi-supervised Generative Adversarial Network (GAN)-based method composed of three main components: a teacher network trained on the labeled WLI data, a cycle-consistency GAN to perform unpaired image-to-image translation, and a multi-input student network. To ensure the quality of the synthetic images generated by the proposed GAN, we perform a detailed quantitative and qualitative analysis with the help of specialists. Conclusion: The overall average classification accuracy, precision, and recall obtained with the proposed method for tissue classification are 0.90, 0.88, and 0.89 respectively, while the same metrics obtained in the unlabeled domain (NBI) are 0.92, 0.64, and 0.94 respectively. The quality of the generated images is reliable enough to deceive specialists. Significance: This study shows the potential of using semi-supervised GAN-based classification to improve bladder tissue classification when annotations are limited in multi-domain data.
Large Neighborhood Search (LNS) is a popular heuristic algorithm for solving combinatorial optimization problems (COPs). It starts with an initial solution to the problem and iteratively improves it by searching a large neighborhood around the current best solution. LNS relies on heuristics to select the neighborhoods to search. In this paper, we focus on designing effective and efficient heuristics in LNS for integer linear programs (ILPs), since a wide range of COPs can be represented as ILPs. Local Branching (LB) is a heuristic that selects the neighborhood that leads to the largest improvement over the current solution in each iteration of LNS. LB is often slow since it needs to solve an ILP of the same size as the input. Our proposed heuristics, LB-RELAX and its variants, use the linear programming relaxation of LB to select neighborhoods. Empirically, LB-RELAX and its variants compute neighborhoods as effective as those of LB but run faster. They achieve state-of-the-art anytime performance on several ILP benchmarks.
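To make the generic LNS loop described above concrete, here is a minimal sketch on a toy 0-1 knapsack ILP. It is not LB or LB-RELAX: the neighborhood is chosen uniformly at random and the sub-problem is solved by brute-force enumeration, both of which are assumptions made purely for illustration. The function name `lns_knapsack` and its parameters are hypothetical.

```python
import itertools
import random


def lns_knapsack(values, weights, capacity, k=3, iters=200, seed=0):
    """Generic LNS on a 0-1 knapsack (maximize value under a weight cap).

    Assumes non-negative weights and capacity, so the empty knapsack is feasible.
    """
    rng = random.Random(seed)
    n = len(values)
    best = [0] * n  # initial solution: take nothing
    best_val = 0
    for _ in range(iters):
        # Neighborhood selection (random here; LB-style heuristics would go here).
        free = rng.sample(range(n), min(k, n))
        fixed = [i for i in range(n) if best[i] and i not in free]
        base_w = sum(weights[i] for i in fixed)
        base_v = sum(values[i] for i in fixed)
        # Re-optimize only the free variables, keeping the rest fixed.
        improved = None
        for assign in itertools.product((0, 1), repeat=len(free)):
            w = base_w + sum(weights[i] for i, a in zip(free, assign) if a)
            v = base_v + sum(values[i] for i, a in zip(free, assign) if a)
            if w <= capacity and v > best_val:
                best_val, improved = v, assign
        # Accept the neighborhood's best assignment only if it improved.
        if improved is not None:
            for i, a in zip(free, improved):
                best[i] = a
    return best, best_val
```

The quality of the result depends on which variables are freed in each iteration, which is exactly the choice that neighborhood-selection heuristics are meant to make well.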
Many datasets are biased, namely they contain easy-to-learn features that are highly correlated with the target class only in the dataset but not in the true underlying distribution of the data. For this reason, learning unbiased models from biased data has become a very relevant research topic in recent years. In this work, we tackle the problem of learning representations that are robust to biases. We first present a margin-based theoretical framework that allows us to clarify why recent contrastive losses (InfoNCE, SupCon, etc.) can fail when dealing with biased data. Based on that, we derive a novel formulation of the supervised contrastive loss (epsilon-SupInfoNCE), providing more accurate control of the minimal distance between positive and negative samples. Furthermore, thanks to our theoretical framework, we also propose FairKL, a new debiasing regularization loss that works well even with extremely biased data. We validate the proposed losses on standard vision datasets including CIFAR10, CIFAR100, and ImageNet, and we assess the debiasing capability of FairKL with epsilon-SupInfoNCE, reaching state-of-the-art performance on a number of biased datasets, including real instances of biases in the wild.
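For readers unfamiliar with the contrastive losses mentioned above, the following is a minimal sketch of the standard InfoNCE loss for a single anchor. It is not the epsilon-SupInfoNCE or FairKL formulation of the paper; the use of cosine similarity, the temperature value, and the names `cosine` and `info_nce` are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Standard InfoNCE: -log(exp(s+) / (exp(s+) + sum_j exp(s_j-)))."""
    s_pos = cosine(anchor, positive) / temperature
    scores = [s_pos] + [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(scores)
    log_denominator = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_denominator - s_pos
```

The loss is small when the anchor is closer to its positive than to every negative, and it grows as any negative becomes more similar to the anchor than the positive is.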
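In many contexts, simpler models are preferable to more complex ones, and controlling model complexity is the goal of many methods in machine learning, such as regularization, hyperparameter tuning, and architecture design. In deep learning, it has been difficult to understand the underlying mechanisms of complexity control, since many traditional measures are not naturally suited to deep neural networks. Here we develop the notion of geometric complexity, a measure of the variability of the model function computed using a discrete Dirichlet energy. Using a combination of theoretical arguments and empirical results, we show that many common training heuristics, such as parameter norm regularization, spectral norm regularization, flatness regularization, implicit gradient regularization, noise regularization, and the choice of parameter initialization, all act to control geometric complexity, providing a unifying framework in which to characterize the behavior of deep learning models.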
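As an illustration of a discrete Dirichlet energy, the sketch below averages the squared Frobenius norm of the model's input Jacobian over a set of data points. Reading the abstract's measure this way is an assumption, as are the central finite-difference approximation and the names `geometric_complexity` and `linear`.

```python
def geometric_complexity(f, points, h=1e-5):
    """Mean over `points` of ||J_f(x)||_F^2, with J estimated by central differences.

    `f` maps a list of floats to a list of floats.
    """
    total = 0.0
    for x in points:
        for j in range(len(x)):
            x_plus = list(x)
            x_minus = list(x)
            x_plus[j] += h
            x_minus[j] -= h
            f_plus, f_minus = f(x_plus), f(x_minus)
            # Column j of the Jacobian, squared and summed over outputs.
            total += sum(((a - b) / (2 * h)) ** 2 for a, b in zip(f_plus, f_minus))
    return total / len(points)


def linear(x):
    """Toy model f(x) = W x with W = [[1, 2], [3, 4]]."""
    return [1.0 * x[0] + 2.0 * x[1], 3.0 * x[0] + 4.0 * x[1]]
```

For the linear toy model the Jacobian is W everywhere, so the estimate should be close to the squared Frobenius norm of W (1 + 4 + 9 + 16 = 30) regardless of the data points used.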
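Automatic segmentation of vestibular schwannoma (VS) and the cochlea from magnetic resonance imaging (MRI) can facilitate treatment planning. Unsupervised segmentation methods have shown promising results without requiring the time-consuming and laborious manual labeling process. In this paper, we present an approach for VS and cochlea segmentation in an unsupervised domain adaptation setting. Specifically, we first develop a cross-site, cross-modality unpaired image translation strategy to enrich the diversity of the synthesized data. Then, we devise a rule-based offline augmentation technique to further minimize the domain gap. Finally, we adopt a self-configuring segmentation framework with self-training to obtain the final results. On the CrossMoDA 2022 validation leaderboard, our method achieves competitive VS and cochlea segmentation performance, with mean Dice scores of 0.8178 $\pm$ 0.0803 and 0.8433 $\pm$ 0.0293, respectively.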
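The Dice score reported above measures the overlap between a predicted and a reference segmentation. A minimal sketch on flattened binary masks follows; the function names are hypothetical, and returning 1.0 when both masks are empty is an assumed convention, not something stated in the abstract.

```python
from statistics import mean


def dice_score(pred, target):
    """Dice = 2 * |A and B| / (|A| + |B|) for flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if size_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / size_sum


def mean_dice(pairs):
    """Average Dice over an iterable of (pred, target) mask pairs."""
    return mean(dice_score(p, t) for p, t in pairs)
```

A leaderboard-style figure such as a mean Dice per structure would be obtained by calling `mean_dice` once per structure over all validation cases.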
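The past decade has seen tremendous improvements in industrial data and computing power, as well as major theoretical advances in machine learning. This opens up an opportunity to apply modern machine learning tools to large-scale nonlinear monitoring and control problems. This article surveys recent results with applications in the process industry.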